Results 1 - 20 of 73
1.
Genet Sel Evol ; 56(1): 29, 2024 Apr 16.
Article in English | MEDLINE | ID: mdl-38627636

ABSTRACT

BACKGROUND: With the introduction of digital phenotyping and high-throughput data, traits that were previously difficult or impossible to measure directly have become easily accessible, offering the opportunity to enhance the efficiency and rate of genetic gain in animal production. It is of interest to assess how behavioral traits are indirectly related to the production traits during the performance testing period. The aim of this study was to assess the quality of behavior data extracted from day-wise video recordings and estimate the genetic parameters of behavior traits and their phenotypic and genetic correlations with production traits in pigs. Behavior was recorded for 70 days after on-test at about 10 weeks of age and ended at off-test for 2008 female purebred pigs, totaling 119,812 day-wise records. Behavior traits included time spent eating, drinking, laterally lying, sternally lying, sitting, standing, and meters of distance traveled. A quality control procedure was created for algorithm training and adjustment, standardizing recording hours, removing culled animals, and filtering unrealistic records. RESULTS: Production traits included average daily gain (ADG), back fat thickness (BF), and loin depth (LD). Single-trait linear models were used to estimate heritabilities of the behavior traits and two-trait linear models were used to estimate genetic correlations between behavior and production traits. The results indicated that all behavior traits are heritable, with heritability estimates ranging from 0.19 to 0.57, and showed low-to-moderate phenotypic and genetic correlations with production traits. Two-trait linear models were also used to compare traits at different intervals of the recording period. To analyze the redundancies in behavior data during the recording period, the averages of various recording time intervals for the behavior and production traits were compared. 
Overall, the average of the 55- to 68-day recording interval had the strongest phenotypic and genetic correlation estimates with the production traits. CONCLUSIONS: Digital phenotyping is a new and low-cost method to record behavior phenotypes, but thorough data cleaning procedures are needed. Evaluating behavioral traits at different time intervals offers a deeper insight into their changes throughout the growth periods and their relationship with production traits, which may be recorded on a less frequent basis.


Subjects
Feeding Behavior , Swine/genetics , Female , Animals , Phenotype , Linear Models
2.
J Anim Sci ; 102, 2024 Jan 03.
Article in English | MEDLINE | ID: mdl-38576313

ABSTRACT

Accurate genetic parameters are crucial for predicting breeding values and selection responses in breeding programs. Genetic parameters change with selection, reducing additive genetic variance and changing genetic correlations. This study investigates the dynamic changes in genetic parameters for residual feed intake (RFI), gain (GAIN), breast percentage (BP), and femoral head necrosis (FHN) in a broiler population that undergoes selection, both with and without the use of genomic information. Changes in single nucleotide polymorphism (SNP) effects were also investigated when including genomic information. The dataset containing 200,093 phenotypes for RFI, 42,895 for BP, 203,060 for GAIN, and 63,349 for FHN was obtained from 55 mating groups. The pedigree included 1,252,619 purebred broilers, of which 154,318 were genotyped with a 60K Illumina Chicken SNP BeadChip. A Bayesian approach within the GIBBSF90+ software was applied to estimate the genetic parameters for single-, two-, and four-trait models with sliding time intervals. For all models, we used genomic-based (GEN) and pedigree-based approaches (PED), meaning with or without genotypes. For GEN (PED), heritability varied from 0.19 to 0.2 (0.31 to 0.21) for RFI, 0.18 to 0.11 (0.25 to 0.14) for GAIN, 0.45 to 0.38 (0.61 to 0.47) for BP, and 0.35 to 0.24 (0.53 to 0.28) for FHN, across the intervals. Changes in genetic correlations estimated by GEN (PED) were 0.32 to 0.33 (0.12 to 0.25) for RFI-GAIN, -0.04 to -0.27 (-0.18 to -0.27) for RFI-BP, -0.04 to -0.07 (-0.02 to -0.08) for RFI-FHN, -0.04 to 0.04 (0.06 to 0.2) for GAIN-BP, -0.17 to -0.06 (-0.02 to -0.01) for GAIN-FHN, and 0.02 to 0.07 (0.06 to 0.07) for BP-FHN. Heritabilities tended to decrease over time while genetic correlations showed both increases and decreases depending on the traits. 
Similar to heritabilities, correlations between SNP effects declined from 0.78 to 0.2 for RFI, 0.8 to 0.2 for GAIN, 0.73 to 0.16 for BP, and 0.71 to 0.14 for FHN over the eight intervals with genomic information, suggesting potential epistatic interactions affecting genetic trait architecture. Given rapid genetic architecture changes and differing estimates between genomic and pedigree-based approaches, using more recent data and genomic information to estimate variance components is recommended for populations undergoing genomic selection to avoid potential biases in genetic parameters.


Genetic parameters are used to predict breeding values for individuals in breeding programs undergoing selection. However, inaccurate genetic parameters can cause breeding values to be biased, and genetic parameters can change over time due to multiple factors. This study aimed to investigate how genetic parameters changed over time in a broiler population using time intervals and observing the behavior of single nucleotide polymorphism (SNP) effects. We studied four traits related to production and disorders while also studying the impact of using genomic information on the estimates. Genetic variances showed an overall decreasing trend, whereas residual variances increased during each interval, resulting in decreasing heritability estimates. Genetic correlations between traits varied but with no major changes over time. Estimates tended to be lower when genomic information was included in the analysis. SNP effects showed changes over time, indicating changes to the genetic background of this population. Using outdated variance components in a population under selection may not represent the current population. Furthermore, when genomic selection is practiced, accounting for this information while estimating variance components is important to avoid biases.


Subjects
Chickens , Single Nucleotide Polymorphism , Genetic Selection , Animals , Chickens/genetics , Male , Female , Breeding , Pedigree , Genotype , Poultry Diseases/genetics , Genomics , Phenotype , Bayes Theorem , Genetic Models
3.
J Anim Breed Genet ; 2024 Mar 25.
Article in English | MEDLINE | ID: mdl-38523564

ABSTRACT

Estimating heritabilities with large genomic models by established methods such as restricted maximum likelihood (REML) or Bayesian via Gibbs sampling is computationally expensive. Alternatively, heritability can be estimated indirectly by method R and by maximum predictivity, referred to as MaxPred here, at a much lower computing cost. By method R, the heritability used for predictions with whole and partial data is considered the best estimate when the predictions based on partial data are unbiased relative to those with the complete data. By MaxPred, the heritability estimate is the one that maximizes predictivity. This study compared heritability estimation with genomic information using average information REML (AI-REML), method R and MaxPred. A simulated population was generated with ten generations of 5000 animals each and an effective population size of 80. Each animal had one record for a trait with a heritability of 0.3, a phenotypic variance of 10.0 and was genotyped at 50 k SNP. In method R, the heritability estimate is found when the expectation of a regression coefficient is equal to one. The regression is the EBV of selection candidates calculated with the whole dataset regressed on the EBV of candidates calculated from a partial dataset. In this study, we used the GBLUP framework and therefore, GEBV was calculated. The partial dataset was created by removing the last generation of phenotypes. Predictivity was defined as the correlation between the adjusted phenotypes of the selection candidates and their GEBV calculated from the partial data. We estimated the heritability for populations that included between three and 10 generations. In every scenario, predictivity increased as more data was used and was the highest at the simulated heritability. However, the predictivity for all data subsets and all heritabilities compared did not differ more than 0.01, suggesting MaxPred is not the best indication for heritability estimation. 
For the whole dataset, the heritability was estimated as 0.30 ± 0.01, 0.26 ± 0.01 and 0.30 ± 0.04 for AI-REML without genomics, AI-REML with genomics and method R with genomics, respectively. Compared to AI-REML with genomics, heritability estimation with genomics by method R reduced computing time by 83%, from 9.5 to 1.6 h on average. Method R has the potential to estimate heritabilities with large genomic information at a low cost when many generations of animals are present; however, the standard error can be high when only a few iterations are used.
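
The two indirect criteria described in this abstract can be illustrated with a small sketch. It assumes the GEBV for each candidate heritability have already been produced by some external solver (not shown), and it only covers the selection step: method R keeps the heritability whose regression of whole-data GEBV on partial-data GEBV is closest to one, and predictivity is the correlation between adjusted phenotypes and partial-data GEBV. All function names and the toy numbers are illustrative assumptions, not taken from the study.

```python
from statistics import mean


def regression_slope(y, x):
    """Slope of the simple linear regression of y on x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def predictivity(adjusted_phenotypes, gebv_partial):
    """Pearson correlation between adjusted phenotypes and partial-data GEBV."""
    mp, mg = mean(adjusted_phenotypes), mean(gebv_partial)
    spg = sum((p - mp) * (g - mg) for p, g in zip(adjusted_phenotypes, gebv_partial))
    spp = sum((p - mp) ** 2 for p in adjusted_phenotypes)
    sgg = sum((g - mg) ** 2 for g in gebv_partial)
    return spg / (spp * sgg) ** 0.5


def method_r_choice(runs):
    """Pick the heritability whose whole-on-partial slope is closest to 1.

    runs maps a candidate heritability to (gebv_whole, gebv_partial),
    both computed with that heritability (hypothetical input layout).
    """
    return min(runs, key=lambda h2: abs(regression_slope(*runs[h2]) - 1.0))


# Toy GEBV for four selection candidates under three candidate heritabilities.
whole = [1.0, 2.0, 3.0, 4.0]
toy_runs = {
    0.2: (whole, [0.8, 1.6, 2.4, 3.2]),  # slope 1.25: partial GEBV too shrunk
    0.3: (whole, [1.0, 2.1, 2.9, 4.0]),  # slope close to 1
    0.4: (whole, [1.3, 2.6, 3.9, 5.2]),  # slope below 1: partial GEBV inflated
}
best_h2 = method_r_choice(toy_runs)
```

A MaxPred-style choice would instead take the candidate heritability with the largest `predictivity` value; the abstract reports that these values differed by no more than 0.01 across heritabilities, which is why that criterion discriminated poorly.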

4.
Genet Sel Evol ; 56(1): 18, 2024 Mar 08.
Article in English | MEDLINE | ID: mdl-38459504

ABSTRACT

BACKGROUND: Validation by data truncation is a common practice in genetic evaluations because of the interest in predicting the genetic merit of a set of young selection candidates. Two of the most used validation methods in genetic evaluations use a single data partition: predictivity or predictive ability (correlation between pre-adjusted phenotypes and estimated breeding values (EBV) divided by the square root of the heritability) and the linear regression (LR) method (comparison of "early" and "late" EBV). Both methods compare predictions with the whole dataset and a partial dataset that is obtained by removing the information related to a set of validation individuals. EBV obtained with the partial dataset are compared against adjusted phenotypes for the predictivity or EBV obtained with the whole dataset in the LR method. Confidence intervals for predictivity and the LR method can be obtained by replicating the validation for different samples (or folds), or bootstrapping. Analytical confidence intervals would be beneficial to avoid running several validations and to test the quality of the bootstrap intervals. However, analytical confidence intervals are unavailable for predictivity and the LR method. RESULTS: We derived standard errors and Wald confidence intervals for the predictivity and statistics included in the LR method (bias, dispersion, ratio of accuracies, and reliability). The confidence intervals for the bias, dispersion, and reliability depend on the relationships and prediction error variances and covariances across the individuals in the validation set. We developed approximations for large datasets that only need the reliabilities of the individuals in the validation set. The confidence intervals for the ratio of accuracies and predictivity were obtained through the Fisher transformation. 
We show the adequacy of both the analytical and approximated analytical confidence intervals and compare them with bootstrap confidence intervals using two simulated examples. The analytical confidence intervals were closer to the simulated ones for both examples. Bootstrap confidence intervals tend to be narrower than the simulated ones. The approximated analytical confidence intervals were similar to those obtained by bootstrapping. CONCLUSIONS: Estimating the sampling variation of predictivity and the statistics in the LR method without replication or bootstrap is possible for any dataset with the formulas presented in this study.


Subjects
Genomics , Genetic Models , Humans , Genotype , Reproducibility of Results , Confidence Intervals , Pedigree , Genomics/methods , Phenotype
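
The abstract for entry 4 states that intervals for the ratio of accuracies and for predictivity were obtained through the Fisher transformation. The sketch below shows only the textbook form of that interval for a correlation, under the assumption of independent observations and a standard error of 1/sqrt(n - 3); the formulas derived in the study may differ (for example, by accounting for relationships among validation animals), and the function name is hypothetical.

```python
import math


def fisher_interval(r, n, z_crit=1.96):
    """Approximate confidence interval for a correlation r from n observations.

    Textbook Fisher z interval: transform with atanh, add and subtract
    z_crit / sqrt(n - 3), then transform back with tanh. Assumes independent
    observations, which is a simplification for related animals.
    """
    if n <= 3:
        raise ValueError("n must be greater than 3")
    z = math.atanh(r)
    half_width = z_crit / math.sqrt(n - 3)
    return math.tanh(z - half_width), math.tanh(z + half_width)


# Example: a correlation of 0.5 estimated on 103 validation animals.
lower, upper = fisher_interval(0.5, 103)
```

Because the interval is built on the transformed scale, it stays inside (-1, 1) and is asymmetric around the estimate whenever the correlation is not zero.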
5.
J Anim Sci ; 101, 2023 Jan 03.
Article in English | MEDLINE | ID: mdl-37837636

ABSTRACT

Genomic estimated breeding values (GEBV) of animals without phenotypes can be indirectly predicted using recursions on GEBV of a subset. To maximize predictive ability of indirect predictions (IP), the subset must represent the independent chromosome segments segregating in the population. We aimed to 1) determine the number of animals needed in recursions to maximize predictive ability, 2) evaluate equivalency IP-GEBV, and 3) investigate trends in predictive ability of IP derived from recent vs. distant generations or accumulating phenotypes from recent to past generations. Data comprised pedigree of 825K birds hatched over 12 overlapping generations, phenotypes for body weight (BW; 820K), residual feed intake (RF; 200K) and weight gain during a trial period (WG; 200K), and breast meat percent (BP; 43K). A total of 154K birds (last six generations) had genotypes. The number of animals that maximize predictive ability was assessed based on the number of largest eigenvalues explaining 99% of variation in the genomic relationship matrix (1Me = 7,131), twice (2Me), or a fraction of this number (i.e., 0.75, 0.50, or 0.25Me). Equivalency between IP and GEBV was measured by correlating these two sets of predictions. GEBV were obtained as if generation 12 (validation animals) was part of the evaluation. IP were derived from GEBV of animals from generations 8 to 11 or generations 11, 10, 9, or 8. IP predictive ability was defined as the correlation between IP and adjusted phenotypes. The IP predictive ability increased from 0.25Me to 1Me (11%, on average); the change from 1Me to 2Me was negligible (0.6%). The correlation IP-GEBV was the same when IP were derived from a subset of 1Me animals chosen randomly across generations (8 to 11) or from generation 11 (0.98 for BW, 0.99 for RF, WG, and BP). A marginal decline in the correlation was observed when IP were based on GEBV of animals from generation 8 (0.95 for BW, 0.98 for RF, WG, and BP). 
Predictive ability had a similar trend; from generation 11 to 8, it changed from 0.32 to 0.31 for BW, from 0.39 to 0.38 for BP, and was constant at 0.33 (0.22) for RF (WG). Predictive ability showed a slight to moderate increase when accumulating up to four generations of phenotypes. 1Me animals provide accurate IP, equivalent to GEBV. A minimal decay in predictive ability is observed when IP are derived from GEBV of animals from four generations back, possibly because of strong selection or the model not being completely additive.


Genomic estimated breeding values (GEBV) of genotyped animals without phenotypes can be obtained by indirect predictions (IP) using recursions on GEBV from a subset. Our objectives were to 1) evaluate the number of animals needed in recursions to maximize predictive ability, 2) assess equivalency between IP and GEBV, and 3) investigate trends in predictive ability of IP derived from recent vs. distant generations or accumulating phenotypes from recent to past generations. The number of animals (7,131) in the recursions that provided high predictive ability was equal to the number of largest eigenvalues explaining 99% of variation in the genomic relationship matrix. IP and GEBV were equivalent (correlation ≥ 0.98). IP predictive ability was similar when recursions were based on animals from recent or distant generations; it decayed marginally with animals from four generations back. The decline in predictive ability can be explained by strong selection or the model not being fully additive. A slight to moderate increase in IP predictive ability was observed when accumulating up to four generations of phenotypes. If GEBV of animals in the subset chosen for recursions are estimated using sufficient data, animals can be from up to four generations back without significant loss in predictive ability.


Subjects
Chickens , Genetic Models , Animals , Chickens/genetics , Genome , Genomics , Genotype , Phenotype , Pedigree
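
The subset size in entry 5 (1Me) is defined as the number of largest eigenvalues explaining 99% of the variation in the genomic relationship matrix. A minimal sketch of that count follows, assuming the eigenvalues have already been computed by some linear algebra routine (not shown); the function name and the toy eigenvalues are illustrative and not from the study.

```python
def n_largest_eigenvalues(eigenvalues, proportion=0.99):
    """Count the largest eigenvalues whose sum reaches `proportion` of the total."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    running = 0.0
    for count, value in enumerate(ordered, start=1):
        running += value
        if running >= proportion * total:
            return count
    return len(ordered)


# Toy spectrum: one dominant eigenvalue and a short tail.
toy_eigenvalues = [6.0, 3.0, 1.0]
me_99 = n_largest_eigenvalues(toy_eigenvalues)        # all three are needed
me_85 = n_largest_eigenvalues(toy_eigenvalues, 0.85)  # the two largest suffice
```

The fractions and multiples tested in the abstract (0.25Me to 2Me) would then simply scale this count before animals are chosen for the recursions.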
6.
J Anim Sci ; 101, 2023 Jan 03.
Article in English | MEDLINE | ID: mdl-37584978

ABSTRACT

Historical data collection for genetic evaluation purposes is a common practice in animal populations; however, the larger the dataset, the higher the computing power needed to perform the analyses. Also, fitting the same model to historical and recent data may be inappropriate. Data truncation can reduce the number of equations to solve, consequently decreasing computing costs; however, the large volume of genotypes is responsible for most of the increase in computations. This study aimed to assess the impact of removing genotypes along with phenotypes and pedigree on the computing performance, reliability, and inflation of genomic predicted breeding value (GEBV) from single-step genomic best linear unbiased predictor for selection candidates. Data from two pig lines, a terminal sire (L1) and a maternal line (L2), were analyzed in this study. Four analyses were implemented: growth and "weaning to finish" mortality on L1, pre-weaning and reproductive traits on L2. Four genotype removal scenarios were proposed: removing genotyped animals without phenotypes and progeny (noInfo), removing genotyped animals based on birth year (Age), the combination of noInfo and Age scenarios (noInfo + Age), and no genotype removal (AllGen). In all scenarios, phenotypes were removed, based on birth year, and three pedigree depths were tested: two and three generations traced back and using the entire pedigree. The full dataset contained 1,452,257 phenotypes for growth traits, 324,397 for weaning to finish mortality, 517,446 for pre-weaning traits, and 7,853,629 for reproductive traits in pure and crossbred pigs. Pedigree files for lines L1 and L2 comprised 3,601,369 and 11,240,865 animals, of which 168,734 and 170,121 were genotyped, respectively. In each truncation scenario, the linear regression method was used to assess the reliability and dispersion of GEBV for genotyped parents (born after 2019). 
The number of years of data that could be removed without harming reliability depended on the number of records, type of analyses (multitrait vs. single trait), the heritability of the trait, and data structure. All scenarios had similar reliabilities, except for noInfo, which performed better in the growth analysis. Based on the data used in this study, considering the last ten years of phenotypes, tracing three generations back in the pedigree, and removing genotyped animals not contributing own or progeny phenotypes, increases computing efficiency with no change in the ability to predict breeding values.


Recording data over many years is common in animal breeding and genetics. However, the larger the dataset, the higher the computing cost of the analysis, especially with genomic information. This study aimed to investigate the impact of removing data, namely, genotypes, phenotypes, and pedigree, on the computing performance and prediction ability of genomic breeding values. We tested four scenarios to remove genotyped individuals in pig populations. For each scenario, phenotypes were removed according to birth year, and the pedigree was either kept complete or traced back two or three generations. Reliabilities for young, genotyped animals did not differ after removing genotypes for older or less important animals. However, using only two generations of data slightly reduces the reliability for young, genotyped animals. The dispersion did not change across the studied scenarios, and its worst value was observed when using only one generation in the pedigree. Using the last ten years of phenotypes, a pedigree depth of three generations, and removing genotyped animals not contributing own or progeny phenotypes reduces computing cost with no change in the ability to predict breeding values.


Subjects
Genomics , Genetic Models , Animals , Swine/genetics , Pedigree , Reproducibility of Results , Phenotype , Genomics/methods
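
Entry 6 truncates the pedigree by tracing a fixed number of generations back from the animals kept in the evaluation. The sketch below illustrates that idea on a toy pedigree. The dictionary layout (animal mapped to a sire and dam pair, with `None` for an unknown parent) and the function name are assumptions made for illustration; the study's own software and file formats are not described in the abstract.

```python
def trace_pedigree(pedigree, animals, generations):
    """Return `animals` plus their ancestors up to `generations` back.

    pedigree maps an animal ID to (sire, dam); None marks an unknown parent.
    """
    kept = set(animals)
    frontier = set(animals)
    for _ in range(generations):
        parents = set()
        for animal in frontier:
            for parent in pedigree.get(animal, (None, None)):
                if parent is not None and parent not in kept:
                    parents.add(parent)
        kept |= parents
        frontier = parents  # only newly added animals need expanding next round
    return kept


# Toy pedigree with hypothetical IDs: a candidate, its parents, and older ancestors.
toy_pedigree = {
    "candidate": ("sire1", "dam1"),
    "sire1": ("grandsire", "granddam"),
    "grandsire": ("great_grandsire", None),
}
two_generations = trace_pedigree(toy_pedigree, ["candidate"], 2)
```

With a depth of two, `great_grandsire` is left out; a depth of three would bring it back in, which mirrors the two- versus three-generation comparison in the abstract.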
7.
JDS Commun ; 4(4): 260-264, 2023 Jul.
Article in English | MEDLINE | ID: mdl-37521061

ABSTRACT

The dairy industry is known for its extensive use of artificial insemination, which has resulted in a population where most animals can be traced back to only a few sires. Due to their relatedness to the population, old influential sires could still contribute to the accuracy of genomic predictions. The objective of the study was to identify the impact of historically influential sires on the recent population. This was tested by constructing a genomic relationship matrix using recursion with different sets of sires. Differences in prediction accuracies with different sets are indicative of how important each set is. Recursion coefficients linking young animals to those sets reveal the relative importance of specific sires to the prediction accuracy of recent animals. The data included ∼10 million scores for stature and fore udder attachment (FUA) measured from 1983. Genotypes of 569,404 animals were available. Sire sets included the 100 most popular sires born within different time periods. Computations were with single-step genomic BLUP. In general, the younger sires had higher prediction accuracies than the oldest sires, even though they generally have fewer progeny. The accuracy of evaluation for stature was increased from 0.54 with the most popular sires born before 1981 to 0.69 with sires born from 2001 to 2010, while the accuracy for FUA increased from 0.47 to 0.61. The accuracy achieved using the overall 100 most used sires was 0.66 for stature and 0.58 for FUA. All 100 sires from each period were combined in a subset to determine the importance of each sire relative to all 400 animals in the combined subset. The highest relative impact of a sire that was born within the different time sets was 1.97 for Valiant (before 1981), 1.94 for Blackstar (1981 to 1990), 4.38 for Shottle (1991 to 2000), and 3.09 for Planet (2001 to 2010). The 3 sires among the 400 with the greatest impact were Shottle, Goldwyn (3.73), and Planet. 
The relative impact of a sire was not strongly related to the number of progeny. For instance, the relative impact of Durham with 34K progeny was 2.29, whereas the impact of O Man with 15K progeny was 3.13. The impact of a sire is also influenced by whether it was used as a sire of sires. Results show that younger sires are more relevant to the accuracy of breeding value prediction in the recent population.

8.
Genet Sel Evol ; 55(1): 55, 2023 Jul 26.
Article in English | MEDLINE | ID: mdl-37495982

ABSTRACT

BACKGROUND: Whole-genome sequence (WGS) data harbor causative variants that may not be present in standard single nucleotide polymorphism (SNP) chip data. The objective of this study was to investigate the impact of using preselected variants from WGS for single-step genomic predictions in maternal and terminal pig lines with up to 1.8k sequenced and 104k sequence imputed animals per line. METHODS: Two maternal and four terminal lines were investigated for eight and seven traits, respectively. The number of sequenced animals ranged from 1365 to 1491 for the maternal lines and 381 to 1865 for the terminal lines. Imputation to sequence occurred within each line for 66k to 76k animals for the maternal lines and 29k to 104k animals for the terminal lines. Two preselected SNP sets were generated based on a genome-wide association study (GWAS). Top40k included the SNPs with the lowest p-value in each of the 40k genomic windows, and ChipPlusSign included significant variants integrated into the porcine SNP chip used for routine genotyping. We compared the performance of single-step genomic predictions between using preselected SNP sets assuming equal or different variances and the standard porcine SNP chip. RESULTS: In the maternal lines, ChipPlusSign and Top40k showed an average increase in accuracy of 0.6 and 4.9%, respectively, compared to the regular porcine SNP chip. The greatest increase was obtained with Top40k, particularly for fertility traits, for which the initial accuracy based on the standard SNP chip was low. However, in the terminal lines, Top40k resulted in an average loss of accuracy of 1%. ChipPlusSign provided a positive, although small, gain in accuracy (0.9%). Assigning different variances for the SNPs slightly improved accuracies when using variances obtained from BayesR. However, increases were inconsistent across the lines and traits. 
CONCLUSIONS: The benefit of using sequence data depends on the line, the size of the genotyped population, and how the WGS variants are preselected. When WGS data are available on hundreds of thousands of animals, using sequence data presents an advantage but this remains limited in pigs.


Subjects
Genome-Wide Association Study , Genome , Animals , Swine/genetics , Genome-Wide Association Study/methods , Genomics/methods , Genotype , Phenotype , Single Nucleotide Polymorphism
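
The Top40k set in entry 8 keeps the SNP with the lowest p-value in each genomic window. The abstract does not say how the windows were defined, so the sketch below assumes fixed-width windows in base pairs within each chromosome; that choice, the tuple layout, and the function name are all illustrative assumptions.

```python
def top_snp_per_window(gwas_hits, window_size):
    """Keep the SNP with the lowest p-value in each fixed-width window.

    gwas_hits is an iterable of (snp_id, chromosome, position_bp, p_value).
    Windows are (chromosome, position_bp // window_size), an assumed definition.
    """
    best = {}
    for snp_id, chromosome, position, p_value in gwas_hits:
        key = (chromosome, position // window_size)
        if key not in best or p_value < best[key][1]:
            best[key] = (snp_id, p_value)
    return [snp_id for snp_id, _ in best.values()]


# Toy GWAS output with hypothetical SNP names.
toy_hits = [
    ("snp_a", 1, 100, 0.05),
    ("snp_b", 1, 900, 0.001),  # same window as snp_a, lower p-value
    ("snp_c", 1, 1500, 0.2),   # next window on chromosome 1
    ("snp_d", 2, 100, 0.01),   # different chromosome
]
selected = top_snp_per_window(toy_hits, window_size=1000)
```

One SNP survives per window regardless of significance, which is what distinguishes this kind of set from ChipPlusSign, where only significant variants are added to the routine chip.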
9.
Genet Sel Evol ; 55(1): 49, 2023 Jul 17.
Article in English | MEDLINE | ID: mdl-37460964

ABSTRACT

BACKGROUND: Identifying true positive variants in genome-wide associations (GWA) depends on several factors, including the number of genotyped individuals. The limited dimensionality of genomic information may give insights into the optimal number of individuals to be used in GWA. This study investigated different discovery set sizes based on the number of largest eigenvalues explaining a certain proportion of variance in the genomic relationship matrix (G). In addition, we investigated the impact on the prediction accuracy by adding variants, which were selected based on different set sizes, to the regular single nucleotide polymorphism (SNP) chips used for genomic prediction. METHODS: We simulated sequence data that included 500k SNPs with 200 or 2000 quantitative trait nucleotides (QTN). A regular 50k panel included one in every ten simulated SNPs. Effective population size (Ne) was set to 20 or 200. GWA were performed using a number of genotyped animals equivalent to the number of largest eigenvalues of G (EIG) explaining 50, 60, 70, 80, 90, 95, 98, and 99% of the variance. In addition, the largest discovery set consisted of 30k genotyped animals. Limited or extensive phenotypic information was mimicked by changing the trait heritability. Significant and large-effect size SNPs were added to the 50k panel and used for single-step genomic best linear unbiased prediction (ssGBLUP). RESULTS: Using a number of genotyped animals corresponding to at least EIG98 allowed the identification of QTN with the largest effect sizes when Ne was large. Populations with smaller Ne required more than EIG98. Furthermore, including genotyped animals with a higher reliability (i.e., a higher trait heritability) improved the identification of the most informative QTN. Prediction accuracy was highest when the significant or the large-effect SNPs representing twice the number of simulated QTN were added to the 50k panel. 
CONCLUSIONS: Accurately identifying causative variants from sequence data depends on the effective population size and, therefore, on the dimensionality of genomic information. This dimensionality can help identify the most suitable sample size for GWA and could be considered for variant selection, especially when resources are restricted. Even when variants are accurately identified, their inclusion in prediction models has limited benefits.


Subjects
Genome-Wide Association Study , Genetic Models , Animals , Reproducibility of Results , Genome , Genomics , Genotype , Phenotype , Single Nucleotide Polymorphism
10.
Front Genet ; 14: 1163626, 2023.
Article in English | MEDLINE | ID: mdl-37252662

ABSTRACT

Genomic evaluations in pigs could benefit from using multi-line data along with whole-genome sequencing (WGS) if the data are large enough to represent the variability across populations. The objective of this study was to investigate strategies to combine large-scale data from different terminal pig lines in a multi-line genomic evaluation (MLE) through single-step GBLUP (ssGBLUP) models while including variants preselected from whole-genome sequence (WGS) data. We investigated single-line and multi-line evaluations for five traits recorded in three terminal lines. The number of sequenced animals in each line ranged from 731 to 1,865, with 60k to 104k imputed to WGS. Unknown parent groups (UPG) and metafounders (MF) were explored to account for genetic differences among the lines and improve the compatibility between pedigree and genomic relationships in the MLE. Sequence variants were preselected based on multi-line genome-wide association studies (GWAS) or linkage disequilibrium (LD) pruning. These preselected variant sets were used for ssGBLUP predictions without and with weights from BayesR, and the performances were compared to that of a commercial porcine single-nucleotide polymorphisms (SNP) chip. Using UPG and MF in MLE showed small to no gain in prediction accuracy (up to 0.02), depending on the lines and traits, compared to the single-line genomic evaluation (SLE). Likewise, adding selected variants from the GWAS to the commercial SNP chip resulted in a maximum increase of 0.02 in the prediction accuracy, only for average daily feed intake in the most numerous lines. In addition, no benefits were observed when using preselected sequence variants in multi-line genomic predictions. Weights from BayesR did not help improve the performance of ssGBLUP. This study revealed limited benefits of using preselected whole-genome sequence variants for multi-line genomic predictions, even when tens of thousands of animals had imputed sequence data. 
Correctly accounting for line differences with UPG or MF in MLE is essential to obtain predictions similar to SLE; however, the only observed benefit of an MLE is to have comparable predictions across lines. Further investigation into the amount of data and novel methods to preselect whole-genome causative variants in combined populations would be of significant interest.

11.
J Anim Sci ; 101, 2023 Jan 03.
Article in English | MEDLINE | ID: mdl-37249185

ABSTRACT

In broiler breeding, superior individuals for growth become parents and are later evaluated for reproduction in an independent evaluation; however, ignoring broiler data can produce inaccurate and biased predictions. This research aimed to determine the most accurate, unbiased, and time-efficient approach for jointly evaluating reproductive and broiler traits. The data comprised a pedigree with 577K birds, 146K genotypes, phenotypes for three reproductive (egg production [EP], fertility [FE], hatch of fertile eggs [HF]; 9K each), and four broiler traits (body weight [BW], breast meat percent [BP], fat percent [FP], residual feed intake [RF]; up to 467K). Broiler data were added sequentially to assess the impact on the quality of predictions for reproductive traits. The baseline scenario (RE) included pedigrees, genotypes, and phenotypes for reproductive traits of selected animals; in RE2, we added their broiler phenotypes; in RE_BR, broiler phenotypes of nonselected animals, and in RE_BR_GE, their genotypes. We computed accuracy, bias, and dispersion of predictions for hens from the last two breeding cycles and their sires. We tested three core definitions for the algorithm of proven and young to find the most time-efficient approach: two random cores with 7K and 12K animals and one with 19K animals, containing parents and young animals. From RE to RE_BR_GE, changes in accuracy were null or minimal for EP (0.51 in hens, 0.59 in roosters) and HF (0.47 in hens, 0.49 in roosters); for FE in hens (roosters), it changed from 0.4 (0.49) to 0.47 (0.53). In hens (roosters), bias (additive SD units) decreased from 0.69 (0.7) to 0.04 (0.05) for EP, 1.48 (1.44) to 0.11 (0.03) for FE, and 1.06 (0.96) to 0.09 (0.02) for HF. Dispersion remained stable in hens (roosters) at ~0.93 (~1.03) for EP, and it improved from 0.57 (0.72) to 0.87 (1.0) for FE and from 0.8 (0.79) to 0.88 (0.87) for HF. Ignoring broiler data deteriorated the predictions' quality. 
The impact was significant for the low heritability trait (0.02; FE); bias (up to 1.5) and dispersion (as low as 0.57) were farther from the ideal value, and accuracy losses were up to 17.5%. Accuracy was maintained in traits with moderate heritability (~0.3; EP and HF), and bias and dispersion were less substantial. Adding information from the broiler phase maximized accuracy and unbiased predictions. The most time-efficient approach is a random core with 7K animals in the algorithm for proven and young.


In breeding programs with sequential selection, the estimation of breeding values becomes biased and inaccurate if the information from the past selection is ignored. We investigated the impact of incorporating broiler data (traits for past selection) into the evaluation of broiler reproductive traits. Including all the information increased the computing demands; therefore, we tested three core definitions for the algorithm for proven and young to determine the most accurate, unbiased, and time-efficient approach for jointly evaluating broiler and reproductive traits. When we ignored broiler data, the estimated breeding values for reproductive traits were biased (up to ~1.5 additive standard deviations). For low heritability traits, accuracy was reduced by up to 17.5%, and breeding values were overestimated (dispersion ~ 0.6). In contrast, incorporating broiler data eliminated bias and overestimation and maximized accuracy. A random core definition for the algorithm for proven and young, with a number of animals equal to the number of the largest eigenvalues explaining 99% of the variation in the genomic relationship matrix, is the most time-efficient while keeping predictions accurate and unbiased in the joint evaluation of broiler and reproductive traits.


Subjects
Chickens, Ovum, Animals, Female, Male, Chickens/genetics, Genome, Genomics, Genotype, Phenotype, Pedigree, Genetic Models
13.
Genet Sel Evol ; 55(1): 6, 2023 Jan 23.
Article in English | MEDLINE | ID: mdl-36690938

ABSTRACT

BACKGROUND: Reliabilities of best linear unbiased predictions (BLUP) of breeding values are defined as the squared correlation between true and estimated breeding values and are helpful in assessing risk and genetic gain. Reliabilities can be computed from the prediction error variances for models with a single base population but are undefined for models that include several base populations and when unknown parent groups are modeled as fixed effects. In such a case, the use of metafounders in principle enables reliabilities to be derived. METHODS: We propose to compute the reliability of the contrast of an individual's estimated breeding value with that of a metafounder based on the prediction error variances of the individual and the metafounder, their prediction error covariance, and their genetic relationship. Computation of the required terms demands only little extra work once the sparse inverse of the mixed model equations is obtained, or they can be approximated. This also allows the reliabilities of the metafounders to be obtained. We studied the reliabilities for both BLUP and single-step genomic BLUP (ssGBLUP), using several definitions of reliability in a large dataset with 1,961,687 dairy sheep and rams, most of which had phenotypes and among which 27,000 rams were genotyped with a 50K single nucleotide polymorphism (SNP) chip. There were 23 metafounders with progeny sizes between 100,000 and 2000 individuals. RESULTS: In models with metafounders, directly using the prediction error variance instead of the contrast with a metafounder leads to artificially low reliabilities because they refer to a population with maximum heterozygosity. When only one metafounder is fitted in the model, the reliability of the contrast is shown to be equivalent to the reliability of the individual in a model without metafounders. 
When there are several metafounders in the model, using a contrast with the oldest metafounder yields reliabilities that are on a meaningful scale and very close to reliabilities obtained from models without metafounders. The reliabilities using contrasts with ssGBLUP also resulted in meaningful values. CONCLUSIONS: This work provides a general method to obtain reliabilities for both BLUP and ssGBLUP when several base populations are included through metafounders.
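As a rough illustration of the contrast-based reliability described in the METHODS above, the sketch below assumes the usual form of a reliability, one minus the ratio of prediction error variance to genetic variance, applied to the contrast between an individual and a metafounder. The abstract lists the ingredients (the two prediction error variances, their covariance, and the genetic relationship) but not the exact formula, so the function name, argument names, formula, and numeric inputs are all assumptions rather than the authors' implementation.

```python
def contrast_reliability(pev_i, pev_m, pec_im, rel_ii, rel_mm, rel_im, sigma2_u):
    """Reliability of the contrast between individual i and metafounder m.

    Assumed form (not stated in the abstract):
        1 - PEV(u_i - u_m) / Var(u_i - u_m)
    """
    # Prediction error variance of the contrast u_i - u_m.
    pev_contrast = pev_i + pev_m - 2.0 * pec_im
    # Genetic variance of the same contrast, from the relationship coefficients.
    var_contrast = (rel_ii + rel_mm - 2.0 * rel_im) * sigma2_u
    if var_contrast <= 0.0:
        raise ValueError("the contrast must have positive genetic variance")
    return 1.0 - pev_contrast / var_contrast


# Hypothetical inputs, chosen only to make the arithmetic easy to follow.
example = contrast_reliability(
    pev_i=0.6, pev_m=0.3, pec_im=0.25,
    rel_ii=1.2, rel_mm=0.4, rel_im=0.4,
    sigma2_u=1.0,
)
print(round(example, 3))
```

With a single base population and no metafounder terms, the same expression reduces to the familiar one minus PEV over the individual's own genetic variance.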


Subjects
Genome, Genetic Models, Animals, Male, Sheep, Reproducibility of Results, Genotype, Genomics/methods, Phenotype, Pedigree
14.
J Anim Breed Genet ; 140(1): 60-78, 2023 Jan.
Article in English | MEDLINE | ID: mdl-35946919

ABSTRACT

Single-step genomic BLUP (ssGBLUP) relies on the combination of the genomic ($\mathbf{G}$) and pedigree relationship matrices for all ($\mathbf{A}$) and genotyped ($\mathbf{A}_{22}$) animals. The procedure ensures $\mathbf{G}$ and $\mathbf{A}_{22}$ are compatible so that both matrices refer to the same genetic base ('tuning'). Then $\mathbf{G}$ is combined with a proportion of $\mathbf{A}_{22}$ ('blending') to avoid singularity problems and to account for the polygenic component not accounted for by markers. This computational procedure has been implemented in the reverse order (blending before tuning) following the sequential research developments. However, blending before tuning may result in less optimal tuning because the blended matrix already contains a proportion of $\mathbf{A}_{22}$. In this study, the impact of 'tuning before blending' was compared with 'blending before tuning' on genomic estimated breeding values (GEBV), single nucleotide polymorphism (SNP) effects and indirect predictions (IP) from ssGBLUP using American Angus Association and Holstein Association USA, Inc. data. Two slightly different tuning methods were used; one that adjusts the mean diagonals and off-diagonals of $\mathbf{G}$ to be similar to those in $\mathbf{A}_{22}$ and another one that adjusts based on the average difference between all elements of $\mathbf{G}$ and $\mathbf{A}_{22}$. Over 6 million Angus growth records and 5.9 million Holstein udder depth records were available. Genomic information was available on 51,478 Angus and 105,116 Holstein animals. Average realized relationship estimates among groups of animals were similar across scenarios. Scatterplots show that GEBV, SNP effects and IP did not noticeably change for all animals in the evaluation regardless of the order of computations and when using blending parameter of 0.05. 
Formulas were derived to determine the blending parameter that maximizes changes in the genomic relationship matrix and GEBV when changing the order of blending and tuning. Algebraically, the change is maximized when the blending parameter is equal to 0.5. Overall, tuning $\mathbf{G}$ before blending, regardless of blending parameter used, had a negligible impact on genomic predictions and SNP effects in this study.
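To make the two orders of computation concrete, here is a minimal sketch of the first tuning method (matching mean diagonals and off-diagonals) combined with blending. It assumes that tuning is a linear rescaling a + bG, with a and b chosen so that the mean diagonal and mean off-diagonal of the rescaled matrix equal those of A22. The abstract does not spell out the formula, so that form, the function names, and the toy 3 x 3 matrices are illustrative assumptions.

```python
def mean_diag_offdiag(m):
    """Return (mean diagonal, mean off-diagonal) of a square matrix."""
    n = len(m)
    diag = sum(m[i][i] for i in range(n)) / n
    off = sum(m[i][j] for i in range(n) for j in range(n) if i != j) / (n * (n - 1))
    return diag, off


def tune(g, a22):
    """Rescale G as a + b*G so its mean diagonal and off-diagonal match A22 (assumed form)."""
    dg, og = mean_diag_offdiag(g)
    da, oa = mean_diag_offdiag(a22)
    b = (da - oa) / (dg - og)
    a = da - b * dg
    return [[a + b * x for x in row] for row in g]


def blend(g, a22, beta=0.05):
    """Return (1 - beta) * G + beta * A22."""
    return [[(1.0 - beta) * x + beta * y for x, y in zip(rg, ra)]
            for rg, ra in zip(g, a22)]


# Hypothetical 3 x 3 matrices, for illustration only.
G = [[0.9, 0.1, 0.2], [0.1, 1.1, 0.0], [0.2, 0.0, 1.0]]
A22 = [[1.0, 0.25, 0.5], [0.25, 1.0, 0.125], [0.5, 0.125, 1.0]]

tuned_then_blended = blend(tune(G, A22), A22)
blended_then_tuned = tune(blend(G, A22), A22)

# Largest element-wise difference between the two orders of computation.
max_diff = max(abs(x - y)
               for r1, r2 in zip(tuned_then_blended, blended_then_tuned)
               for x, y in zip(r1, r2))
print(max_diff)
```

On this toy example with a blending parameter of 0.05, the two orders give matrices that differ only slightly, which is in line with the negligible impact reported in the abstract.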


Subjects
Genomics, Animals
15.
JDS Commun ; 3(5): 343-347, 2022 Sep.
Article in English | MEDLINE | ID: mdl-36340904

ABSTRACT

Evaluations using single-step genomic BLUP require blending the genomic relationship matrix (G) with a positive definite matrix to ensure nonsingularity for solving the mixed model equations. Many organizations blend G with a proportion of the numerator relationship matrix for genotyped animals ($\mathbf{A}_{22}$) to improve stability and possibly add a residual polygenic effect. However, when nearly all the polygenic variance is explained by G, blending with $\mathbf{A}_{22}$ may cause inflation and add excess computing time; thus, blending with an identity matrix (I) multiplied by a small value may be a better solution. The objective of this study was to evaluate changes in reliability and inflation of genomic estimated breeding values, convergence rate, elapsed wall-clock time for blending G with different levels of $\mathbf{A}_{22}$ or I, and develop a more time-efficient blending method. A US Holstein cattle data set was used with 9.7 million animals in the pedigree, 569,404 animals with genotypes, and 10.1 million stature phenotypes. Blending G by adding a small value to the diagonal elements had comparable performance to $\mathbf{A}_{22}$ with fewer rounds to convergence required to solve the system of equations. Reliability and inflation of genomic estimated breeding values ranged from 0.63 to 0.68 and 0.86 to 0.89 for all blending scenarios tested. The current blending default in the BLUPF90 software is to replace G with $(1 - \beta)\mathbf{G} + \beta\mathbf{A}_{22}$, where $\beta$ equals 0.05. In this study, $\beta$ values of 0.30, 0.20, 0.05, 0.01, 0.005, and 0.001 were evaluated with $\mathbf{A}_{22}$ and I. Negligible differences in elapsed computing time between the blending types and levels were observed. Subsequently, the current blending algorithm used in the BLUPF90 family of programs was optimized, reducing the blending time from approximately 2 h to 5 min for $\mathbf{A}_{22}$ and less than 1 s for I. The new time difference between blending with $\mathbf{A}_{22}$ or I is negligible and not computationally critical. 
The results indicate that blending G with $\mathbf{A}_{22}$ does not have clear advantages over blending with a small proportion of I.
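The effect of blending on singularity can be seen on a toy case. The sketch below applies the blending rule quoted in the abstract, (1 - beta) G + beta A22 with beta = 0.05, to a deliberately singular 2 x 2 G, and then repeats it with the identity matrix in place of A22. Using the same weighted form for the identity case is an assumption made here for symmetry; the matrices and helper names are hypothetical and this is not the BLUPF90 code.

```python
def blend(g, other, beta=0.05):
    """Return (1 - beta) * G + beta * other, element by element."""
    return [[(1.0 - beta) * x + beta * y for x, y in zip(rg, ro)]
            for rg, ro in zip(g, other)]


def identity(n):
    """n x n identity matrix as nested lists."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def det2(m):
    """Determinant of a 2 x 2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


# Two animals with identical genotypes make G singular (hypothetical values).
G = [[1.0, 1.0], [1.0, 1.0]]
# Pedigree relationships for the same two animals, e.g. full sibs.
A22 = [[1.0, 0.5], [0.5, 1.0]]

blended_a22 = blend(G, A22)
blended_i = blend(G, identity(2))

print(det2(G), det2(blended_a22), det2(blended_i))
```

In both cases the determinant moves away from zero, so the blended matrix can be inverted; the identity version needs no pedigree relationships for the genotyped animals.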

16.
Genet Sel Evol ; 54(1): 66, 2022 Sep 27.
Article in English | MEDLINE | ID: mdl-36162979

ABSTRACT

BACKGROUND: Although single-step GBLUP (ssGBLUP) is an animal model, SNP effects can be backsolved from genomic estimated breeding values (GEBV). Predicted SNP effects allow indirect predictions (IP) to be computed for each individual as the sum of the SNP effects multiplied by its gene content, which is helpful when the number of genotyped animals is large, for genotyped animals not in the official evaluations, and when interim evaluations are needed. Typically, IP are obtained for new batches of genotyped individuals, all of them young and without phenotypes. Individual (theoretical) accuracies for IP are rarely reported, but they are nevertheless of interest. Our first objective was to present equations to compute the individual accuracy of IP based on the prediction error covariance (PEC) of SNP effects, which in turn is obtained from the PEC of GEBV in ssGBLUP. The second objective was to test the algorithm for proven and young (APY) in PEC computations. With large datasets, it is impossible to handle the full PEC matrix; thus, the third objective was to examine the minimum number of genotyped animals needed in PEC computations to achieve IP accuracies that are equivalent to GEBV accuracies. RESULTS: Correlations between GEBV and IP for the validation animals using SNP effects from ssGBLUP evaluations were ≥ 0.99. When all available genotyped animals were used for PEC computations, correlations between GEBV and IP accuracy were ≥ 0.99. In addition, IP accuracies were compatible with GEBV accuracies either with direct inversion of the genomic relationship matrix (G) or using the algorithm for proven and young (APY) to obtain the inverse of G. As the number of genotyped animals included in the PEC computations decreased from around 55,000 to 15,000, correlations were still ≥ 0.96, but IP accuracies were biased downwards. 
CONCLUSIONS: Theoretical accuracy of indirect prediction can be successfully obtained by computing SNP PEC out of GEBV PEC from ssGBLUP equations using direct or APY G inverse. It is possible to reduce the number of genotyped animals in PEC computations, but accuracies may be underestimated. Further research is needed to approximate SNP PEC from ssGBLUP to limit the computational requirements with many genotyped animals.
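The indirect prediction defined in the BACKGROUND, the sum of the SNP effects multiplied by the animal's gene content, can be sketched in a few lines. Gene content is assumed here to be coded as 0, 1, or 2 copies of the reference allele with no centering; the function name and the numbers are hypothetical.

```python
def indirect_prediction(gene_content, snp_effects):
    """Sum of SNP effects weighted by one animal's gene content."""
    if len(gene_content) != len(snp_effects):
        raise ValueError("gene content and SNP effects must have the same length")
    return sum(z * a for z, a in zip(gene_content, snp_effects))


# Hypothetical SNP effects backsolved from GEBV, and one young animal's genotype.
snp_effects = [0.5, -0.25, 0.125]
gene_content = [2, 1, 0]  # allele counts, assumed uncentered

ip = indirect_prediction(gene_content, snp_effects)
print(ip)
```

Because only the SNP effects and the new genotypes are needed, a new batch of young animals can be predicted without rerunning the full evaluation.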


Subjects
Genome, Genetic Models, Animals, Genomics, Genotype, Pedigree, Phenotype
17.
Genet Sel Evol ; 54(1): 52, 2022 Jul 16.
Article in English | MEDLINE | ID: mdl-35842585

ABSTRACT

BACKGROUND: Single-step genomic predictions obtained from a breeding value model require calculating the inverse of the genomic relationship matrix ($\mathbf{G}$). The Algorithm for Proven and Young (APY) creates a sparse representation of $\mathbf{G}^{-1}$ with a low computational cost. APY consists of selecting a group of core animals and expressing the breeding values of the remaining animals as a linear combination of those from the core animals plus an error term. The objectives of this study were to: (1) extend APY to marker effects models; (2) derive equations for marker effect estimates when APY is used for breeding value models, and (3) show the implication of selecting a specific group of core animals in terms of a marker effects model. RESULTS: We derived a family of marker effects models called APY-SNP-BLUP. It differs from the classic marker effects model in that the row space of the genotype matrix is reduced and an error term is fitted for non-core animals. We derived formulas for marker effect estimates that take this error term into account. The prediction error variance (PEV) of the marker effect estimates depends on the PEV for core animals but not directly on the PEV of the non-core animals. We extended the APY-SNP-BLUP to include a residual polygenic effect and accommodate non-genotyped animals. We show that selecting a specific group of core animals is equivalent to selecting a subspace of the row space of the genotype matrix. As the number of core animals increases, subspaces corresponding to different sets of core animals tend to overlap, showing that random selection of core animals is algebraically justified. CONCLUSIONS: The APY-(ss)GBLUP models can be expressed in terms of marker effect models. When the number of core animals is equal to the rank of the genotype matrix, APY-SNP-BLUP is identical to the classic marker effects model. 
If the number of core animals is less than the rank of the genotype matrix, genotypes for non-core animals are imputed as a linear combination of the genotypes of the core animals. For estimating SNP effects, only relationships and estimated breeding values for core animals are needed.


Subjects
Genome, Genetic Models, Algorithms, Animals, Genomics, Genotype, Pedigree, Phenotype
18.
J Anim Breed Genet ; 139(4): 367-368, 2022 Jul.
Article in English | MEDLINE | ID: mdl-35674365
19.
Genet Sel Evol ; 54(1): 34, 2022 May 20.
Article in English | MEDLINE | ID: mdl-35596130

ABSTRACT

BACKGROUND: The algorithm for proven and young (APY) has been suggested as a solution for recursively computing a sparse representation for the inverse of a large genomic relationship matrix (G). In APY, a subset of genotyped individuals is used as the core and the remaining genotyped individuals are used as noncore. Size and definition of the core are relevant research subjects for the application of APY, especially given the ever-increasing number of genotyped individuals. METHODS: The aim of this study was to investigate several core definitions, including the most popular animals (MPA) (i.e., animals with high contributions to the genetic pool), the least popular males (LPM), the least popular females (LPF), a random set (Rnd), animals evenly distributed across genealogical paths (Ped), unrelated individuals (Unrel), or based on within-family selection (Fam), or on decomposition of the gene content matrix (QR). Each definition was evaluated for six core sizes based on prediction accuracy of single-step genomic best linear unbiased prediction (ssGBLUP) with APY. Prediction accuracy of ssGBLUP with the full inverse of G was used as the baseline. The dataset consisted of 357k pedigreed Duroc pigs with 111k pigs with genotypes and ~ 220k phenotypic records. RESULTS: When the core size was equal to the number of largest eigenvalues explaining 50% of the variation of G (n = 160), MPA and Ped core definitions delivered the highest average prediction accuracies (~ 0.41-0.53). As the core size increased to the number of eigenvalues explaining 99% of the variation in G (n = 7320), prediction accuracy was nearly identical for all core types and correlations with genomic estimated breeding values (GEBV) from ssGBLUP with the full inversion of G were greater than 0.99 for all core definitions. Cores that represent all generations, such as Rnd, Ped, Fam, and Unrel, were grouped together in the hierarchical clustering of GEBV. 
CONCLUSIONS: For small core sizes, the definition of the core matters; however, as the size of the core reaches an optimal value equal to the number of largest eigenvalues explaining 99% of the variation of G, the definition of the core becomes arbitrary.
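The core-size rule used in this abstract, the number of largest eigenvalues of G that together explain a given share of its variation, amounts to a cumulative sum over sorted eigenvalues. The sketch below assumes the eigenvalues have already been computed by some other routine; the function name and the toy spectrum are hypothetical.

```python
def core_size(eigenvalues, fraction=0.99):
    """Smallest number of largest eigenvalues whose sum reaches `fraction` of the total."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    running = 0.0
    for count, value in enumerate(ordered, start=1):
        running += value
        if running >= fraction * total:
            return count
    return len(ordered)


# Hypothetical eigenvalues of a small genomic relationship matrix (they sum to 100).
toy_eigenvalues = [50.0, 30.0, 10.0, 5.0, 3.0, 1.5, 0.5]

print(core_size(toy_eigenvalues, 0.50))
print(core_size(toy_eigenvalues, 0.99))
```

On a real G, the spectrum decays much more slowly, which is why the 50% and 99% thresholds in the abstract correspond to such different core sizes (160 versus 7320 animals).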


Subjects
Genome, Genetic Models, Algorithms, Animals, Female, Genomics, Genotype, Humans, Male, Pedigree, Phenotype, Swine
20.
J Anim Sci ; 100(5)2022 May 01.
Article in English | MEDLINE | ID: mdl-35289906

ABSTRACT

Efficient computing techniques allow the estimation of variance components for virtually any traditional dataset. When genomic information is available, variance components can be estimated using genomic REML (GREML). If only a portion of the animals have genotypes, single-step GREML (ssGREML) is the method of choice. The genomic relationship matrix (G) used in both cases is dense, limiting computations depending on the number of genotyped animals. The algorithm for proven and young (APY) can be used to create a sparse inverse of G ($\mathbf{G}_{\mathrm{APY}}^{-1}$) with close to linear memory and computing requirements. In ssGREML, the inverse of the realized relationship matrix ($\mathbf{H}^{-1}$) also includes the inverse of the pedigree relationship matrix, which can be dense with a long pedigree but sparser with a short one. The main purpose of this study was to investigate whether costs of ssGREML can be reduced using APY with truncated pedigree and phenotypes. We also investigated the impact of truncation on variance components estimation when different numbers of core animals are used in APY. Simulations included 150K animals from 10 generations, with selection. Phenotypes ($h^2 = 0.3$) were available for all animals in generations 1-9. A total of 30K animals in generations 8 and 9, and 15K validation animals in generation 10 were genotyped for 52,890 SNP. Average information REML and ssGREML with $\mathbf{G}^{-1}$ and $\mathbf{G}_{\mathrm{APY}}^{-1}$ using 1K, 5K, 9K, and 14K core animals were compared. Variance components are impacted when the core group in APY represents the number of eigenvalues explaining a small fraction of the total variation in G. The most time-consuming operation was the inversion of G, with more than 50% of the total time. Next, numerical factorization consumed nearly 30% of the total computing time. On average, a 7% decrease in the computing time for ordering was observed by removing each generation of data. 
APY can be successfully applied to create the inverse of the genomic relationship matrix used in ssGREML for estimating variance components. To ensure reliable variance component estimation, it is important to use a core size that corresponds to the number of largest eigenvalues explaining around 98% of total variation in G. When APY is used, pedigrees can be truncated to increase the sparsity of H and slightly reduce computing time for ordering and symbolic factorization, with no impact on the estimates.


The estimation of variance components is computationally expensive under large-scale genetic evaluations due to several inversions of the coefficient matrix. Variance components are used as parameters for estimating breeding values in mixed model equations (MME). However, resulting breeding values are not Best Linear Unbiased Predictions (BLUP) unless the variance components approach the true parameters. The increasing availability of genomic data requires the development of new methods for improving the efficiency of variance component estimations. Therefore, this study aimed to reduce the costs of single-step genomic REML (ssGREML) with the Algorithm for Proven and Young (APY) for estimating variance components with truncated pedigree and phenotypes using simulated data. In addition, we investigated the influence of truncation on variance components and genetic parameter estimates. Under APY, the size of the core group influences the similarity of breeding values and their reliability compared to the full genomic matrix. In this study, we found that to ensure reliable variance component estimation, it is required to consider a core size that corresponds to the number of largest eigenvalues explaining around 98% of the total variation in G to avoid biased parameters. In terms of costs, the use of APY slightly decreased the time for ordering and symbolic factorization with no impact on estimations.


Subjects
Genome, Genetic Models, Algorithms, Animals, Genomics/methods, Genotype, Pedigree, Phenotype